Implement postgres migration for contributor and leaderboard by codebestia · Pull Request #403 · SolFoundry/solfoundry

codebestia · 2026-03-21T16:53:02Z

Description

This PR migrates the contributor and leaderboard services from an in-memory storage MVP to full PostgreSQL persistence utilizing asyncpg and SQLAlchemy.

Specifically, this PR implements the following:

Transitions the contributor_service to utilize async PostgreSQL sessions, guaranteeing consistency.
Modifies the leaderboard_service to query ranks via the database instead of in-memory caches, with an added TTL cache layer to maintain <100ms performance queries.
Adds Alembic configurations alongside initial migration scripts (001_create_contributors_table) to build missing database tables safely.
Integrates robust seed scripts for pulling historical pull request/GitHub data back into the DB tables during cold starts.
Keeps all API Pydantic response schemas backwards-compatible so no frontend changes are required.

Closes #186

Solana Wallet for Payout

Wallet: 4QhseKvBuaCQhdkP248iXoUxohPzVC5m8pE9hAv4nMYw

Type of Change

🐛 Bug fix (non-breaking change which fixes an issue)
✨ New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📝 Documentation update
🎨 Style/UI update
♻️ Code refactoring
⚡ Performance improvement
✅ Test addition/update

Checklist

Code is clean and follows the issue spec exactly
One PR per bounty (no multiple bounties in one PR)
Tests included for new functionality
All existing tests pass
No console.log or debugging code left behind
No hardcoded secrets or API keys

Testing

Manual testing performed
Unit tests added/updated
Integration tests added/updated

Screenshots (if applicable)

N/A

github-actions

Automated review: changes requested (see comment)

github-actions

Automated review: changes requested (see comment)

github-actions

Automated review: changes requested (see comment)

coderabbitai · 2026-03-21T17:03:31Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR implements a comprehensive PostgreSQL migration for contributor and leaderboard services, replacing in-memory storage with SQLAlchemy ORM models (ContributorTable, ReputationHistoryDB). It adds Alembic configuration and migrations, refactors contributor_service and leaderboard_service to use async database operations with Redis caching, introduces a GitHub-based seed script for initial data import, and updates tests to validate persistence and database behavior. API schemas remain backward compatible while storage transitions from ephemeral in-memory to durable PostgreSQL. Associated configuration updates include Alembic logging setup, SQLite engine pooling adjustments, and modified database initialization imports.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

🏭 Bounty: Contributor & Leaderboard PostgreSQL Migration — 500,000 $FNDRY #186: Directly addresses the bounty requirements for PostgreSQL migration of contributor and leaderboard services, including SQLAlchemy models, Alembic migrations, async service rewrites, database-backed caching, and seed scripts.
🏭 Bounty: PostgreSQL Full Migration — 500,000 $FNDRY #162: Related vector-matched issue addressing similar PostgreSQL persistence concerns, though 🏭 Bounty: Contributor & Leaderboard PostgreSQL Migration — 500,000 $FNDRY #186 is the primary scope of this PR.

Possibly related PRs

feat: Contributor Reputation System #250: Modifies contributor reputation features including API endpoints, reputation models/services, and contributor DB access/score update helpers that overlap with this PR's reputation service and contributor model changes.

Suggested labels

approved, paid

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Implement postgres migration for contributor and leaderboard' accurately summarizes the main change: migrating contributor and leaderboard services to PostgreSQL persistence.
Description check	✅ Passed	The description comprehensively explains the migration from in-memory storage to PostgreSQL persistence, including service changes, Alembic configuration, seed scripts, and backward compatibility assurances.
Linked Issues check	✅ Passed	The PR addresses all coding requirements from issue `#186`: SQLAlchemy models for contributors [contributor.py], Alembic migration scripts [002_full_pg_persistence.py, alembic.ini], async PostgreSQL rewrite of contributor_service, leaderboard_service database querying with Redis caching, seed script [seed_contributors_from_github.py], backward-compatible schemas, and connection pooling.
Out of Scope Changes check	✅ Passed	All changes align with the PostgreSQL migration objectives. Updates to test fixtures [conftest.py, test_contributors.py, test_leaderboard.py, test_reputation.py], reputation_service lock conversion to asyncio, leaderboard schema field addition [top_skills], and database.py refactoring are all within scope of the migration effort.
Docstring Coverage	✅ Passed	Docstring coverage is 98.86% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 26

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

backend/app/services/reputation_service.py (2)
394-415: ⚠️ Potential issue | 🟠 Major

History persistence and aggregate score updates can commit independently.

Lines 401-407 recompute and persist the contributor’s aggregate reputation_score, then Lines 410-415 persist the history entry in a separate step. If the history write fails, DB reads rebuilt from history no longer match the contributor row, and the inconsistency survives restarts because get_reputation() recomputes from history rather than the aggregate column.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/reputation_service.py` around lines 394 - 415, The
current flow appends to the in-memory _reputation_store, updates the aggregate
via contributor_service.update_reputation_score, then asynchronously persists
the history with persist_reputation_entry, which can leave the aggregate
out-of-sync if the history write fails; change the order and/or use a single
transactional persist so history is durable before committing the aggregate:
call persist_reputation_entry(entry) first (or wrap both
persist_reputation_entry and contributor_service.update_reputation_score in a DB
transaction), only update _reputation_store and call
contributor_service.update_reputation_score(round(total, 2)) after the history
write succeeds, and handle exceptions by not updating the aggregate when
persist_reputation_entry raises; also ensure get_reputation still recomputes
from history consistently.
503-522: ⚠️ Potential issue | 🟠 Major

The reputation leaderboard does an O(N) fan-out of repeated DB/history loads.

Lines 503-522 call get_reputation() once per contributor ID. Each call then re-fetches contributor data and reloads reputation history again, so leaderboard cost grows with contributor count and repeatedly scans the same history set instead of ranking in one query/pass.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/reputation_service.py` around lines 503 - 522, The
current loop calls get_reputation(contributor_id, include_history=False) for
each id, causing N repeated DB/history loads; instead implement a bulk path: add
a new bulk function (e.g., get_reputations_bulk or extend get_reputation to
accept a list) that takes all_ids and include_history flag and returns a list of
ReputationSummary objects by fetching contributor rows and reputation histories
in one query/pass, then replace the per-id loop in the leaderboard code (which
currently calls contributor_service.list_contributor_ids and get_reputation)
with a single call to that bulk function and then sort by (-reputation_score,
username) and slice offset:offset+limit before returning.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/alembic.ini`:
- Around line 1-51: Resolve the Git conflict markers in the alembic.ini by
removing the "<<<<<<<", "=======", ">>>>>>>" blocks and keeping a single
coherent INI content under the [alembic] section (pick the correct
script_location value used by your project: either "migrations/alembic" or
"alembic"); also remove the hardcoded credentials in sqlalchemy.url (the
sqlalchemy.url key) and replace it with a safe placeholder or an empty fallback
(e.g., a driver://user:pass@host/dbname placeholder) so env.py can reliably read
DATABASE_URL at runtime, and ensure any logger/handler sections you keep are
valid INI syntax.

In `@backend/alembic/env.py`:
- Around line 28-36: The current fallback DATABASE_URL string with real-looking
credentials is unsafe; update env.py so database_url (the variable computed and
passed to config.set_main_option) does not default to "postgres:postgres".
Instead, if os.getenv("DATABASE_URL") is missing, either raise a clear runtime
error (fail fast) unless in a dev mode flag, or set a non-functional placeholder
value; implement this check before the startswith/replace logic so the code
either raises an exception with a helpful message or uses an explicit
INVALID_DATABASE_URL placeholder when not in development (use your existing
environment indicator or add one) and then call
config.set_main_option("sqlalchemy.url", database_url).

In `@backend/alembic/versions/001_create_contributors_table.py`:
- Around line 110-137: Add a database-level uniqueness constraint to the
reputation_history table to prevent duplicate rows for the same contributor and
bounty: update the op.create_table call for "reputation_history" (or add an
op.create_unique_constraint immediately after) to include a UniqueConstraint on
("contributor_id", "bounty_id") so the database enforces idempotency for
contributor_id + bounty_id and avoids race-condition inserts that the
reputation_service tries to deduplicate in application code.

In `@backend/app/api/contributors.py`:
- Around line 65-66: The current split of query params into skill_list and
badge_list returns raw tokens (e.g., ["python", " rust", ""]) which can break DB
filters; update the logic around skill_list and badge_list in contributors.py to
split on commas, trim whitespace from each token, drop any empty tokens, and
optionally normalize casing (e.g., lower()) before passing to DB; ensure you
still use parameterized queries or ORM filters to avoid injection risk when
using these cleaned lists.
- Around line 3-10: The file contains unresolved Git conflict markers (<<<<<<<,
=======, >>>>>>>) in the module docstring and inside multiple endpoint handlers
which prevents FastAPI from importing the router; remove all conflict markers
and merge the text so the module docstring is a single coherent sentence (choose
the intended description), then reconcile duplicated or conflicting handler
blocks by keeping the correct implementation for each endpoint (look for symbols
like router, any `@router.get/`@router.post/@router.put/@router.delete handlers,
and functions such as
get_contributor/create_contributor/update_contributor/delete_contributor or
reputation-related delegators) so there are no leftover markers or duplicate
definitions; after merging, ensure imports and the router variable are intact
and run the linter/tests to confirm FastAPI can import the module.

In `@backend/app/api/health.py`:
- Around line 52-58: The health endpoint currently calls
contributor_service.count_contributors() and accesses
bounty_service._bounty_store directly which can crash the endpoint; wrap the
contributor_service.count_contributors() call in a try/except to catch DB
errors, log the exception, and set a safe fallback (e.g., None or -1) and mark
the metric/state as degraded instead of letting the error propagate, and replace
direct access to _bounty_store by adding and using a public function on
bounty_service (e.g., get_bounty_count()) to obtain bounty metrics; keep
get_last_sync() usage as-is but ensure all failures are handled similarly so the
health response never raises an exception.

In `@backend/app/api/leaderboard.py`:
- Around line 43-45: The query parameter name "range: Optional[str] =
Query(...)" in leaderboard endpoint shadows Python's built-in range(); rename
the parameter (e.g., time_range: Optional[str] = Query(None,
description="Frontend range: 7d, 30d, 90d, all", alias="range")) and update all
uses inside the endpoint handler to use time_range (or chosen name); if you need
to keep the external query param as "range" use Query's alias argument as shown
so OpenAPI/client behavior is unchanged, and update any tests/docs that
reference the old name.

In `@backend/app/models/contributor.py`:
- Around line 95-110: ReputationHistoryDB lacks a uniqueness constraint for the
idempotency key used by record_reputation(), so duplicate (contributor_id,
bounty_id) rows can be inserted; add a composite UNIQUE constraint (or
UniqueConstraint) on (contributor_id, bounty_id) in the ReputationHistoryDB
model and update the DB migration so the database enforces idempotency at
write-time; ensure any existing code that upserts or catches IntegrityError in
record_reputation() handles duplicate-key errors gracefully after the migration.

In `@backend/app/seed_leaderboard.py`:
- Around line 100-106: seed_leaderboard currently uses
asyncio.get_event_loop().run_until_complete(async_seed_leaderboard()) which will
break when called from an existing event loop; change seed_leaderboard() into an
async def seed_leaderboard() that simply awaits async_seed_leaderboard(), and
update both call sites (where main.py's lifespan() and seed_database.py's
seed_all() invoke seed_leaderboard) to use await seed_leaderboard() instead of
calling it synchronously or using asyncio.run()/run_until_complete().

In `@backend/app/services/contributor_service.py`:
- Around line 181-224: The SELECT used to produce paginated contributor results
(base_query used to compute rows_result with .offset(skip).limit(limit)) lacks
an ORDER BY, causing nondeterministic paging; modify the query-building in the
contributor list flow to apply a deterministic order (e.g., base_query =
base_query.order_by(ContributorTable.id) or by a timestamp like
ContributorTable.created_at DESC) before applying offset/limit, and keep
count_query unchanged; ensure the same order_by is used consistently so paging
returns stable, repeatable pages.
- Around line 141-149: The insert path still allows a TOCTOU race to surface as
a 500 because commit/flush errors aren’t translated; wrap both await
session.flush() and the auto_session commit/refresh block in try/except that
catches SQLAlchemy integrity/unique constraint exceptions (e.g.
sqlalchemy.exc.IntegrityError or DBAPI-specific unique-violation), detect the
constraint for contributors.username, and raise/return the existing
duplicate/409 error (or translate to the same DuplicateUsername/HTTPConflict
used elsewhere) instead of letting the exception bubble; apply this change
around the symbols session.flush(),
async_session_factory()/auto_session.commit(), and auto_session.refresh(row).
- Around line 424-460: update_reputation_score mutates
ContributorTable.reputation_score but never invalidates the leaderboard cache
(leaderboard_service.get_leaderboard), causing stale reads; fix by calling the
leaderboard cache invalidation API after a successful update: invoke
leaderboard_service.invalidate_leaderboard() (or the existing cache invalidation
function) immediately after the update when using an external session (after
db_session.flush() where row.reputation_score is set) and also after the
auto-commit path (right after await auto_session.commit()); ensure the
invalidation call only runs when a row was actually modified.

In `@backend/app/services/github_sync.py`:
- Line 464: The skills list is being silently truncated with "skills[:10]" which
can lose contributor data; update the code around that assignment (the place
building the payload/dict with "skills": skills[:10]) to first check if
len(skills) > 10, and if so emit a warning using the module/service logger
(e.g., logger.warning or self.logger.warning) including the contributor
identifier (e.g., contributor_id or login) and the original count, then set the
field to skills[:10]; alternatively document the 10-item limit in the
function/class docstring or payload schema if truncation is intentional.
- Around line 458-473: The upsert_contributor calls inside the sync loop must be
executed inside a single DB transaction to avoid partial commits; modify the
sync routine to open a transaction/session before the loop (or use
contributor_service.transaction/context manager if available), perform all
upsert_contributor(...) calls within that transaction, and commit once all
upserts succeed (or rollback on error), and ensure refresh_store_cache() runs
only after commit; alternatively, catch exceptions around the transaction to log
failures and avoid leaving synced_count partial state.
- Around line 470-471: The snippet is forcibly setting "created_at": now -
timedelta(days=45) on every sync which overwrites real creation timestamps for
new contributors; change the logic in the sync path that prepares contributor
payloads so it only sets created_at for true seeding (or when the DB record is
absent) and otherwise derives created_at from the contributor's actual earliest
activity (e.g., earliest PR/commit date) or omits the field so the database
default/previous value remains; update the code that calls upsert_contributor()
to pass the computed earliest_date (or no created_at) instead of the fixed now -
timedelta value and add a short comment documenting this behavior.

In `@backend/app/services/leaderboard_service.py`:
- Around line 175-177: The period cutoff is being applied to
ContributorTable.created_at which removes users created before the cutoff
instead of limiting the earnings/reputation used for ranking; change the logic
so the cutoff filters activity/earnings entries used to compute total_earnings
and reputation_score (e.g., apply _period_cutoff(period) to the
earnings/activity table timestamp column or to the subquery/aggregation that
produces total_earnings and reputation_score) rather than
ContributorTable.created_at so rankings at lines referencing total_earnings and
reputation_score reflect only the period window.
- Around line 109-112: The code is truncating floating reputation values by
casting row.reputation_score to int; update the response construction to
preserve the float (e.g., use float(row.reputation_score or 0.0)) wherever
reputation_score is currently cast to int (replace int(...) in the blocks that
build leaderboard entries and any other occurrence around reputation_score),
ensuring the API returns the full float value that matches the persisted/ranked
value.

In `@backend/app/services/reputation_service.py`:
- Around line 3-24: The file contains unresolved Git conflict markers (<<<<<<<,
=======, >>>>>>>) in the module docstring, helper docstrings, and multiple
function bodies which makes the module unimportable; open
backend/app/services/reputation_service.py, remove all conflict markers and
choose the correct content for each conflicted section (e.g., unify the module
docstring and the import block to either include logging and threading or
asyncio as needed for the current implementation), resolve each function body
conflict by keeping the intended logic (preserve calls to
contributor_service.update_reputation_score, reputation history handling, and
any async vs sync signatures), ensure imports match the chosen implementation
(replace or keep asyncio/logging/threading), run the linter/tests to verify
parsing and runtime behavior, and commit the cleaned file with no remaining
<<<<<<</======/>>>>>> markers.

In `@backend/requirements.txt`:
- Line 7: The requirements.txt contains a duplicate dependency entry for
"alembic>=1.13.0,<2.0.0"; remove the redundant line so the package is declared
only once (keep a single "alembic>=1.13.0,<2.0.0" entry) to eliminate
maintenance confusion and potential inconsistencies when updating dependencies.

In `@backend/scripts/seed_contributors_from_github.py`:
- Around line 111-136: The pagination loop that calls client.get and checks
response.status_code should implement retries with exponential backoff for
transient errors (handle 429 and 5xx) instead of immediately breaking; augment
the block around response = await client.get(url, headers=_headers(),
params=params) to retry a configurable number of times with increasing sleep,
inspect response.headers.get("X-RateLimit-Remaining") and sleep until reset or
backoff when remaining is low, and if retries are exhausted or a non-retriable
code is returned (e.g., persistent 4xx) log the full response and raise an
exception (or return a failure) rather than break to avoid silent partial data;
keep variables/flow using response, client.get, _headers(), params, per_page,
page, all_prs, and logger so you modify the existing loop and error handling
only.
- Around line 224-228: The bot-filtering is brittle: update the logic around the
author variable (where avatar is also derived) to perform case-insensitive
checks and broader bot matching; e.g., compute author_lower = author.lower() and
then skip when author_lower.endswith("[bot]") or "bot" in author_lower or
author_lower is in an expanded set like
{"dependabot","github-actions","renovate","snyk-bot","codecov"} (or use a small
regex for common bot patterns) so that the existing check using
endswith("[bot]") and the tuple ("dependabot","github-actions") is replaced with
this normalized, extensible test.

In `@backend/tests/conftest.py`:
- Around line 21-39: The custom global loop and accessor (_test_loop and
get_test_loop) duplicate pytest-asyncio behavior and have a wrong type hint;
change the annotation to Optional[asyncio.AbstractEventLoop] for _test_loop and
either remove this custom loop entirely or replace it with a proper pytest
fixture named event_loop that returns a new event loop for tests (using
asyncio.new_event_loop() and yielding/closing it) so all async tests and helpers
(e.g., run_async) use pytest-asyncio's loop; update call sites of get_test_loop
to use the pytest event_loop fixture or accept the fixture via parameters and
delete the global state and set_event_loop calls to avoid conflicts.

In `@backend/tests/test_contributors.py`:
- Around line 1-6: Update the module docstring in
backend/tests/test_contributors.py to accurately describe the test setup:
replace the incorrect "Uses an in-memory SQLite database for test isolation"
with a statement that the tests use the PostgreSQL engine provided by
app.database (the async PostgreSQL-backed contributor service). Keep the rest of
the docstring intact but ensure it references the PostgreSQL engine from
app.database rather than SQLite.

In `@backend/tests/test_leaderboard.py`:
- Around line 64-66: Remove the manual cache write to contributor_service._store
after calling upsert_contributor; instead invoke the service's cache-refresh
logic (call contributor_service.refresh_store_cache() after
run_async(contributor_service.upsert_contributor(row_data))) so the test uses
the same cache-population path as production and avoids duplicating internal
behavior (_store, upsert_contributor, refresh_store_cache).

In `@backend/tests/test_reputation.py`:
- Around line 72-101: The test helper is manually mutating the service cache by
calling contributor_service.upsert_contributor(...) then setting
contributor_service._store[cid] = row, causing duplicated cache-management
logic; either update the service method upsert_contributor to populate the
internal cache (_store) itself so tests only call upsert_contributor, or remove
the manual assignment and extract the pattern into a shared test fixture (e.g.,
a helper used by tests that returns both row and ContributorResponse) so tests
stop directly touching contributor_service._store; locate usages around
upsert_contributor, contributor_service._store, ContributorResponse, and
ContributorStats to implement the fix.

---

Outside diff comments:
In `@backend/app/services/reputation_service.py`:
- Around line 394-415: The current flow appends to the in-memory
_reputation_store, updates the aggregate via
contributor_service.update_reputation_score, then asynchronously persists the
history with persist_reputation_entry, which can leave the aggregate out-of-sync
if the history write fails; change the order and/or use a single transactional
persist so history is durable before committing the aggregate: call
persist_reputation_entry(entry) first (or wrap both persist_reputation_entry and
contributor_service.update_reputation_score in a DB transaction), only update
_reputation_store and call
contributor_service.update_reputation_score(round(total, 2)) after the history
write succeeds, and handle exceptions by not updating the aggregate when
persist_reputation_entry raises; also ensure get_reputation still recomputes
from history consistently.
- Around line 503-522: The current loop calls get_reputation(contributor_id,
include_history=False) for each id, causing N repeated DB/history loads; instead
implement a bulk path: add a new bulk function (e.g., get_reputations_bulk or
extend get_reputation to accept a list) that takes all_ids and include_history
flag and returns a list of ReputationSummary objects by fetching contributor
rows and reputation histories in one query/pass, then replace the per-id loop in
the leaderboard code (which currently calls
contributor_service.list_contributor_ids and get_reputation) with a single call
to that bulk function and then sort by (-reputation_score, username) and slice
offset:offset+limit before returning.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 56bac5d9-afd5-4a5e-a13d-d3be3040edc8

📥 Commits

Reviewing files that changed from the base of the PR and between a3af033 and 1438f59.

📒 Files selected for processing (24)

backend/alembic.ini
backend/alembic/__init__.py
backend/alembic/env.py
backend/alembic/script.py.mako
backend/alembic/versions/001_create_contributors_table.py
backend/alembic/versions/__init__.py
backend/app/api/contributors.py
backend/app/api/health.py
backend/app/api/leaderboard.py
backend/app/database.py
backend/app/models/contributor.py
backend/app/models/leaderboard.py
backend/app/seed_leaderboard.py
backend/app/services/contributor_service.py
backend/app/services/github_sync.py
backend/app/services/leaderboard_service.py
backend/app/services/reputation_service.py
backend/requirements.txt
backend/scripts/__init__.py
backend/scripts/seed_contributors_from_github.py
backend/tests/conftest.py
backend/tests/test_contributors.py
backend/tests/test_leaderboard.py
backend/tests/test_reputation.py

coderabbitai · 2026-03-21T17:03:34Z

backend/alembic.ini

+<<<<<<< ours
 [alembic]
 script_location = migrations/alembic
 # DB URL is overridden at runtime by env.py reading DATABASE_URL env var.
 # Fallback only used when DATABASE_URL is not set.
 sqlalchemy.url = postgresql+asyncpg://localhost/solfoundry
+=======
+# Alembic configuration for SolFoundry backend.
+#
+# Manages database schema migrations for PostgreSQL.
+# The connection URL is read from the DATABASE_URL environment variable
+# at runtime (see alembic/env.py).
+
+[alembic]
+script_location = alembic
+prepend_sys_path = .
+sqlalchemy.url = postgresql+asyncpg://postgres:postgres@localhost/solfoundry
+
+[loggers]
+keys = root,sqlalchemy,alembic
+
+[handlers]
+keys = console
+
+[formatters]
+keys = generic
+
+[logger_root]
+level = WARN
+handlers = console
+
+[logger_sqlalchemy]
+level = WARN
+handlers =
+qualname = sqlalchemy.engine
+
+[logger_alembic]
+level = INFO
+handlers =
+qualname = alembic
+
+[handler_console]
+class = StreamHandler
+args = (sys.stderr,)
+level = NOTSET
+formatter = generic
+
+[formatter_generic]
+format = %(levelname)-5.5s [%(name)s] %(message)s
+datefmt = %H:%M:%S
+>>>>>>> theirs


⚠️ Potential issue | 🔴 Critical

Unresolved merge conflict markers break Alembic configuration.

The file contains Git merge conflict markers (<<<<<<< ours, =======, >>>>>>> theirs) that will cause Alembic to fail when parsing the INI file. These must be resolved before merging.

Additionally, line 17 contains hardcoded database credentials (postgres:postgres@localhost). While the docstring states the URL is overridden at runtime via DATABASE_URL, committing default credentials is a security anti-pattern—consider using a placeholder like driver://user:pass@localhost/dbname or removing the fallback entirely.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/alembic.ini` around lines 1 - 51, Resolve the Git conflict markers in the alembic.ini by removing the "<<<<<<<", "=======", ">>>>>>>" blocks and keeping a single coherent INI content under the [alembic] section (pick the correct script_location value used by your project: either "migrations/alembic" or "alembic"); also remove the hardcoded credentials in sqlalchemy.url (the sqlalchemy.url key) and replace it with a safe placeholder or an empty fallback (e.g., a driver://user:pass@host/dbname placeholder) so env.py can reliably read DATABASE_URL at runtime, and ensure any logger/handler sections you keep are valid INI syntax.

coderabbitai · 2026-03-21T17:03:34Z

backend/alembic/env.py

+database_url = os.getenv(
+    "DATABASE_URL",
+    "postgresql+asyncpg://postgres:postgres@localhost/solfoundry",
+)
+if database_url.startswith("postgresql://") and "+asyncpg" not in database_url:
+    database_url = database_url.replace(
+        "postgresql://", "postgresql+asyncpg://", 1
+    )
+config.set_main_option("sqlalchemy.url", database_url)


⚠️ Potential issue | 🟡 Minor

Hardcoded fallback credentials in migration environment.

The fallback DATABASE_URL on line 30 contains default credentials (postgres:postgres). While overridden at runtime, this poses a risk if the environment variable is accidentally unset in production. Consider failing explicitly when DATABASE_URL is missing in non-development environments, or use a clearly invalid placeholder.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/alembic/env.py` around lines 28 - 36, The current fallback DATABASE_URL string with real-looking credentials is unsafe; update env.py so database_url (the variable computed and passed to config.set_main_option) does not default to "postgres:postgres". Instead, if os.getenv("DATABASE_URL") is missing, either raise a clear runtime error (fail fast) unless in a dev mode flag, or set a non-functional placeholder value; implement this check before the startswith/replace logic so the code either raises an exception with a helpful message or uses an explicit INVALID_DATABASE_URL placeholder when not in development (use your existing environment indicator or add one) and then call config.set_main_option("sqlalchemy.url", database_url).

coderabbitai · 2026-03-21T17:03:34Z

backend/alembic/versions/001_create_contributors_table.py

+    op.create_table(
+        "reputation_history",
+        sa.Column(
+            "id",
+            postgresql.UUID(as_uuid=True),
+            primary_key=True,
+            nullable=False,
+        ),
+        sa.Column(
+            "contributor_id",
+            postgresql.UUID(as_uuid=True),
+            sa.ForeignKey("contributors.id", ondelete="CASCADE"),
+            nullable=False,
+            index=True,
+        ),
+        sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False),
+        sa.Column("bounty_title", sa.String(255), nullable=False),
+        sa.Column("bounty_tier", sa.Integer(), nullable=False),
+        sa.Column("review_score", sa.Float(), nullable=False),
+        sa.Column("earned_reputation", sa.Float(), nullable=False),
+        sa.Column("anti_farming_applied", sa.Boolean(), nullable=False, server_default=sa.text("false")),
+        sa.Column(
+            "created_at",
+            sa.DateTime(timezone=True),
+            nullable=True,
+            server_default=sa.func.now(),
+        ),
+    )


⚠️ Potential issue | 🟠 Major

Missing unique constraint on (contributor_id, bounty_id) in reputation_history.

The reputation_service implements idempotent duplicate handling in application code (returning existing entry if bounty_id matches). However, without a database-level unique constraint, concurrent requests could insert duplicate reputation records for the same bounty. Consider adding:

+ sa.UniqueConstraint("contributor_id", "bounty_id", name="uq_reputation_contributor_bounty"),

This enforces idempotency at the database level and prevents race conditions.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

op.create_table(

"reputation_history",

sa.Column(

"id",

postgresql.UUID(as_uuid=True),

primary_key=True,

nullable=False,

),

sa.Column(

"contributor_id",

postgresql.UUID(as_uuid=True),

sa.ForeignKey("contributors.id", ondelete="CASCADE"),

nullable=False,

index=True,

),

sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False),

sa.Column("bounty_title", sa.String(255), nullable=False),

sa.Column("bounty_tier", sa.Integer(), nullable=False),

sa.Column("review_score", sa.Float(), nullable=False),

sa.Column("earned_reputation", sa.Float(), nullable=False),

sa.Column("anti_farming_applied", sa.Boolean(), nullable=False, server_default=sa.text("false")),

sa.Column(

"created_at",

sa.DateTime(timezone=True),

nullable=True,

server_default=sa.func.now(),

),

)

op.create_table(

"reputation_history",

sa.Column(

"id",

postgresql.UUID(as_uuid=True),

primary_key=True,

nullable=False,

),

sa.Column(

"contributor_id",

postgresql.UUID(as_uuid=True),

sa.ForeignKey("contributors.id", ondelete="CASCADE"),

nullable=False,

index=True,

),

sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False),

sa.Column("bounty_title", sa.String(255), nullable=False),

sa.Column("bounty_tier", sa.Integer(), nullable=False),

sa.Column("review_score", sa.Float(), nullable=False),

sa.Column("earned_reputation", sa.Float(), nullable=False),

sa.Column("anti_farming_applied", sa.Boolean(), nullable=False, server_default=sa.text("false")),

sa.Column(

"created_at",

sa.DateTime(timezone=True),

nullable=True,

server_default=sa.func.now(),

),

sa.UniqueConstraint("contributor_id", "bounty_id", name="uq_reputation_contributor_bounty"),

)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/alembic/versions/001_create_contributors_table.py` around lines 110 - 137, Add a database-level uniqueness constraint to the reputation_history table to prevent duplicate rows for the same contributor and bounty: update the op.create_table call for "reputation_history" (or add an op.create_unique_constraint immediately after) to include a UniqueConstraint on ("contributor_id", "bounty_id") so the database enforces idempotency for contributor_id + bounty_id and avoids race-condition inserts that the reputation_service tries to deduplicate in application code.

backend/app/api/contributors.py

coderabbitai · 2026-03-21T17:03:34Z

backend/app/api/contributors.py

    skill_list = skills.split(",") if skills else None
    badge_list = badges.split(",") if badges else None


⚠️ Potential issue | 🟡 Minor

Comma-separated filters are forwarded without normalization.

Lines 65-66 split raw query strings but do not trim whitespace or drop empty tokens. Requests like ?skills=python, rust, become ["python", " rust", ""], which flows straight into the DB filters and can suppress otherwise valid matches. As per coding guidelines, backend/**: Analyze thoroughly: Input validation and SQL injection vectors.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/app/api/contributors.py` around lines 65 - 66, The current split of query params into skill_list and badge_list returns raw tokens (e.g., ["python", " rust", ""]) which can break DB filters; update the logic around skill_list and badge_list in contributors.py to split on commas, trim whitespace from each token, drop any empty tokens, and optionally normalize casing (e.g., lower()) before passing to DB; ensure you still use parameterized queries or ORM filters to avoid injection risk when using these cleaned lists.

coderabbitai · 2026-03-21T17:03:35Z

backend/scripts/seed_contributors_from_github.py

+        author = pr.get("user", {}).get("login", "unknown")
+        avatar = pr.get("user", {}).get("avatar_url", "")
+
+        if author.endswith("[bot]") or author in ("dependabot", "github-actions"):
+            continue


🧹 Nitpick | 🔵 Trivial

Bot filtering may miss edge cases.

The bot exclusion logic checks for [bot] suffix and exact matches for dependabot/github-actions. Consider:

The [bot] check is case-sensitive (endswith("[bot]")) — GitHub bot accounts may vary in casing

Other common bots like renovate, codecov, snyk-bot are not filtered

- if author.endswith("[bot]") or author in ("dependabot", "github-actions"): + if author.lower().endswith("[bot]") or author.lower() in ( + "dependabot", "github-actions", "renovate", "codecov", "snyk-bot" + ):

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/scripts/seed_contributors_from_github.py` around lines 224 - 228, The bot-filtering is brittle: update the logic around the author variable (where avatar is also derived) to perform case-insensitive checks and broader bot matching; e.g., compute author_lower = author.lower() and then skip when author_lower.endswith("[bot]") or "bot" in author_lower or author_lower is in an expanded set like {"dependabot","github-actions","renovate","snyk-bot","codecov"} (or use a small regex for common bot patterns) so that the existing check using endswith("[bot]") and the tuple ("dependabot","github-actions") is replaced with this normalized, extensible test.

coderabbitai · 2026-03-21T17:03:35Z

backend/tests/conftest.py

+# Shared event loop for all tests that need synchronous async execution
+_test_loop: asyncio.AbstractEventLoop = None  # type: ignore
+
+
+def get_test_loop() -> asyncio.AbstractEventLoop:
+    """Return the shared test event loop, creating it if needed.
+
+    This ensures all synchronous test helpers (``run_async``) use the
+    same event loop, avoiding 'no current event loop' errors when
+    running the full test suite.
+
+    Returns:
+        The shared asyncio event loop for tests.
+    """
+    global _test_loop
+    if _test_loop is None or _test_loop.is_closed():
+        _test_loop = asyncio.new_event_loop()
+        asyncio.set_event_loop(_test_loop)
+    return _test_loop


⚠️ Potential issue | 🟡 Minor

Custom event loop management may conflict with pytest-asyncio.

The custom _test_loop global with get_test_loop() duplicates functionality that pytest-asyncio provides via its event_loop fixture. This could lead to conflicts if async tests (@pytest.mark.asyncio) use a different loop than run_async() calls.

Additionally, line 22 has an incorrect type annotation: _test_loop: asyncio.AbstractEventLoop = None should be Optional[asyncio.AbstractEventLoop] = None.

Consider using pytest-asyncio's event_loop_policy fixture or scoped event_loop fixture instead of rolling your own loop management.

Fix type annotation

+from typing import Optional + # Shared event loop for all tests that need synchronous async execution -_test_loop: asyncio.AbstractEventLoop = None # type: ignore +_test_loop: Optional[asyncio.AbstractEventLoop] = None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/tests/conftest.py` around lines 21 - 39, The custom global loop and accessor (_test_loop and get_test_loop) duplicate pytest-asyncio behavior and have a wrong type hint; change the annotation to Optional[asyncio.AbstractEventLoop] for _test_loop and either remove this custom loop entirely or replace it with a proper pytest fixture named event_loop that returns a new event loop for tests (using asyncio.new_event_loop() and yielding/closing it) so all async tests and helpers (e.g., run_async) use pytest-asyncio's loop; update call sites of get_test_loop to use the pytest event_loop fixture or accept the fixture via parameters and delete the global state and set_event_loop calls to avoid conflicts.

coderabbitai · 2026-03-21T17:03:35Z

backend/tests/test_contributors.py

+"""Tests for contributor profiles API with PostgreSQL persistence.

-Tests verify CRUD operations through the HTTP API layer. All service
-functions are async (DB-first reads) and are exercised via the ASGI
-event loop through TestClient. DB reads are stubbed out to prevent
-cross-test contamination from shared SQLite in-memory databases.
+Verifies that the contributor CRUD endpoints work correctly against
+the async PostgreSQL-backed contributor service.  Uses an in-memory
+SQLite database for test isolation.
 """


⚠️ Potential issue | 🟡 Minor

Docstring incorrectly references SQLite.

The docstring states "Uses an in-memory SQLite database for test isolation" but the tests actually use the PostgreSQL engine from app.database. Update the docstring to reflect the actual implementation:

-Verifies that the contributor CRUD endpoints work correctly against -the async PostgreSQL-backed contributor service. Uses an in-memory -SQLite database for test isolation. +Verifies that the contributor CRUD endpoints work correctly against +the async PostgreSQL-backed contributor service. Uses the shared +test database with per-test row deletion for isolation.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"""Tests for contributor profiles API with PostgreSQL persistence.

Tests verify CRUD operations through the HTTP API layer. All service

functions are async (DB-first reads) and are exercised via the ASGI

event loop through TestClient. DB reads are stubbed out to prevent

cross-test contamination from shared SQLite in-memory databases.

Verifies that the contributor CRUD endpoints work correctly against

the async PostgreSQL-backed contributor service. Uses an in-memory

SQLite database for test isolation.

"""

"""Tests for contributor profiles API with PostgreSQL persistence.

Verifies that the contributor CRUD endpoints work correctly against

the async PostgreSQL-backed contributor service. Uses the shared

test database with per-test row deletion for isolation.

"""

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_contributors.py` around lines 1 - 6, Update the module docstring in backend/tests/test_contributors.py to accurately describe the test setup: replace the incorrect "Uses an in-memory SQLite database for test isolation" with a statement that the tests use the PostgreSQL engine provided by app.database (the async PostgreSQL-backed contributor service). Keep the rest of the docstring intact but ensure it references the PostgreSQL engine from app.database rather than SQLite.

coderabbitai · 2026-03-21T17:03:35Z

backend/tests/test_leaderboard.py

+    row = run_async(contributor_service.upsert_contributor(row_data))
+    contributor_service._store[str(row.id)] = row
+    return row


🧹 Nitpick | 🔵 Trivial

Manual _store population duplicates service behavior.

After calling upsert_contributor(), the test helper manually inserts the row into _store. This bypasses the service's refresh_store_cache() mechanism and could cause test/production behavior divergence if the service's caching logic changes. Consider calling refresh_store_cache() instead or ensuring the service handles cache population internally on upsert.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_leaderboard.py` around lines 64 - 66, Remove the manual cache write to contributor_service._store after calling upsert_contributor; instead invoke the service's cache-refresh logic (call contributor_service.refresh_store_cache() after run_async(contributor_service.upsert_contributor(row_data))) so the test uses the same cache-population path as production and avoids duplicating internal behavior (_store, upsert_contributor, refresh_store_cache).

coderabbitai · 2026-03-21T17:03:35Z

backend/tests/test_reputation.py

+    row = run_async(
+        contributor_service.upsert_contributor(
+            {
+                "id": uuid.UUID(cid),
+                "username": username,
+                "display_name": username,
+                "skills": ["python"],
+                "badges": [],
+                "social_links": {},
+                "total_contributions": 0,
+                "total_bounties_completed": 0,
+                "total_earnings": Decimal("0"),
+                "reputation_score": 0.0,
+                "created_at": now,
+                "updated_at": now,
+            }
+        )
+    )
+    contributor_service._store[cid] = row
    return ContributorResponse(
-        id=cid, username=username, display_name=username, skills=["python"],
-        badges=[], social_links={}, stats=ContributorStats(),
-        created_at=now, updated_at=now)
+        id=cid,
+        username=username,
+        display_name=username,
+        skills=["python"],
+        badges=[],
+        social_links={},
+        stats=ContributorStats(),
+        created_at=now,
+        updated_at=now,
+    )


🧹 Nitpick | 🔵 Trivial

Test helper duplicates _store cache management.

Similar to the leaderboard tests, _mc() manually inserts into _store after upserting. This pattern is repeated across test files and could be extracted into a shared fixture or the service could handle cache updates internally.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_reputation.py` around lines 72 - 101, The test helper is manually mutating the service cache by calling contributor_service.upsert_contributor(...) then setting contributor_service._store[cid] = row, causing duplicated cache-management logic; either update the service method upsert_contributor to populate the internal cache (_store) itself so tests only call upsert_contributor, or remove the manual assignment and extract the pattern into a shared test fixture (e.g., a helper used by tests that returns both row and ContributorResponse) so tests stop directly touching contributor_service._store; locate usages around upsert_contributor, contributor_service._store, ContributorResponse, and ContributorStats to implement the fix.

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)

backend/app/database.py (1)
97-118: ⚠️ Potential issue | 🟠 Major

Schema bootstrap still fails open for DB-backed contributor features.

Lines 97-118 wrap the new contributor/reputation model imports and Base.metadata.create_all() in a blanket except that only logs a warning. After this migration, init_db() is building required persistence for contributor counting, leaderboard reads, seeding, and reputation writes, so a model-import or DDL failure should stop startup/tests instead of deferring the breakage to later requests.

As per coding guidelines, backend/**: "Analyze thoroughly: - Error handling and edge case coverage".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/database.py` around lines 97 - 118, The DB schema bootstrap
currently swallows all exceptions around the model imports and await
conn.run_sync(Base.metadata.create_all) (where ContributorTable,
ReputationHistoryDB, ContributorTable, BountyTable, SubmissionDB, etc. are
imported), which allows startup to continue with missing persistence; change
this to fail fast by removing the blanket except Exception that only logs a
warning and instead propagate the error (or log with logger.error and re-raise)
so init_db() aborts startup/tests on import/DDL failures; alternatively catch
only expected non-fatal errors and re-raise unexpected ones to ensure missing
contributor/reputation tables stop startup.
backend/app/services/reputation_service.py (2)
236-258: ⚠️ Potential issue | 🟠 Major

Tier gating and idempotency still depend on per-process cache state.

Lines 243-258 consult _reputation_store for duplicate bounty_id detection, unlocked-tier checks, and anti-farming. In a multi-worker deployment or any path that serves requests before hydration finishes, different processes can see different history and accept duplicate records or higher-tier submissions that PostgreSQL would already show as ineligible.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/reputation_service.py` around lines 236 - 258, The
current checks (duplicate bounty_id, allowed tier, anti-farming) rely on the
in-memory _reputation_store and can be inconsistent across processes; replace
those validations with authoritative DB queries/transactions: inside the
critical section (or a DB transaction using
contributor_service.get_contributor_db as needed) query PostgreSQL for existing
reputation/history rows for data.contributor_id (and for an existing bounty_id)
instead of reading _reputation_store, compute allowed tier using the persisted
history (i.e., call _allowed_tier_for_contributor with the DB-derived history)
and compute is_veteran from DB history as well, and enforce idempotency/tier
checks based on those results (or use SELECT ... FOR UPDATE/unique constraints +
handle conflicts) rather than the _reputation_store cache.
278-297: ⚠️ Potential issue | 🔴 Critical

The write path can acknowledge success without persisting the canonical event.

Lines 278-297 update the in-memory history and contributor aggregate before attempting persist_reputation_entry(entry), then downgrade any history-write failure to a log line. get_reputation() later recomputes totals from history rows (Lines 321-353), so the API can return success even though the next DB-backed read/restart drops that reputation change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/reputation_service.py` around lines 278 - 297, The
current flow updates _reputation_store and calls
contributor_service.update_reputation_score before persisting, which can cause
in-memory state to diverge if persist_reputation_entry(entry) fails; change the
flow so you call persist_reputation_entry(entry) first and only update
_reputation_store and call contributor_service.update_reputation_score after a
successful persist, or if you must update memory first then ensure you rollback
the in-memory append and revert the contributor score when
persist_reputation_entry raises; locate the logic around
_reputation_store.setdefault(data.contributor_id, []).append(entry), the
subsequent round(total, 2) update via
contributor_service.update_reputation_score, and the persist_reputation_entry
call and implement the transactional ordering or compensating rollback to ensure
get_reputation() will reflect only persisted entries.
backend/app/api/leaderboard.py (1)
39-107: ⚠️ Potential issue | 🟠 Major

The endpoint no longer exposes the structured leaderboard schema it asks the service to build.

backend/app/services/leaderboard_service.py:229-304 still returns a LeaderboardResponse, but Lines 79-107 always collapse it into a bare array and discard period, total, offset, limit, and top3. That is an API contract change for callers using the backend-style period path, and it conflicts with the PR objective to keep response schemas backward-compatible.

As per coding guidelines, backend/**: "Analyze thoroughly: - API contract consistency with spec".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/leaderboard.py` around lines 39 - 107, The current endpoint
(function leaderboard) always returns a bare array built from result.entries and
drops the service's LeaderboardResponse fields; update leaderboard to preserve
the backend-style schema when callers use the backend param by checking whether
the period argument was provided (i.e., period is not None) and, if so, return a
JSONResponse containing the full structured payload: period, total, offset,
limit, top3 (from the LeaderboardResponse returned by get_leaderboard) and the
mapped contributors (built from result.entries); otherwise keep the existing
frontend-friendly array response. Ensure you reference get_leaderboard,
LeaderboardResponse, result.entries, and preserve all original fields (period,
total, offset, limit, top3) in the response.

♻️ Duplicate comments (5)

backend/requirements.txt (1)
7-7: ⚠️ Potential issue | 🟡 Minor

Duplicate Alembic pin remains in the manifest.

Lines 7 and 20 declare the same alembic>=1.13.0,<2.0.0 constraint. Pip will tolerate it, but dependency updates now have two authoritative-looking locations for the same package, and this duplication was already reported on an earlier revision.

Also applies to: 20-20
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/requirements.txt` at line 7, Remove the duplicate package constraint
so the requirements manifest contains only a single entry for alembic
(alembic>=1.13.0,<2.0.0); locate the second/extra occurrence of the exact string
"alembic>=1.13.0,<2.0.0" in requirements.txt and delete it, ensuring no other
duplicate package pins remain in the file.
backend/app/api/health.py (1)
52-69: ⚠️ Potential issue | 🟠 Major

/health still turns DB outages into 500s.

After Lines 49-50 already determine the database is disconnected, Lines 57-58 still call count_contributors(), and backend/app/services/contributor_service.py:570-593 executes a real SQL count. Any exception there aborts the endpoint instead of returning a degraded payload; the same block also imports bounty_service._bounty_store, so the health surface depends on a private in-memory implementation detail.

As per coding guidelines, backend/**: "Analyze thoroughly: - Error handling and edge case coverage".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/health.py` around lines 52 - 69, The health endpoint calls
contributor_service.count_contributors() and directly references
bounty_service._bounty_store even when db_status is "disconnected", which lets
DB errors raise 500s and leaks a private implementation detail; change logic in
the health handler to only call contributor_service.count_contributors() when
db_status == "connected" (or wrap the call in a try/except and fall back to a
safe default like None or 0 on error), and stop importing/using the private
_bounty_store—use a public API (e.g., add/use a
bounty_service.get_bounty_count() or similar) or compute a safe fallback count;
ensure get_last_sync() remains unchanged but handle its None case as already
done so the endpoint returns a degraded payload instead of raising on DB
failures.
backend/tests/conftest.py (1)
21-71: ⚠️ Potential issue | 🟠 Major

The shared test harness now leaks both DB state and loop state across the suite.

Lines 21-71 create one module-global event loop and one session-scoped SQLite schema, and backend/app/database.py:36-44 uses StaticPool for SQLite. That means every test shares the same in-memory database, so persisted contributor/leaderboard state bleeds between tests, while pytest-asyncio tests can still run on a different loop than run_async(), which is risky for module-level asyncio primitives such as backend/app/services/reputation_service.py:_reputation_lock.

Expected result: the search should show both run_async(...) call sites and async tests, confirming mixed loop strategies over a single shared SQLite database.
#!/bin/bash
set -euo pipefail

echo "=== shared loop helper in conftest ==="
CONFTEST_FILE="$(fd '^conftest\.py$' backend/tests | head -n1)"
sed -n '18,80p' "$CONFTEST_FILE"

echo
echo "=== SQLite engine configuration ==="
DB_FILE="$(fd '^database\.py$' backend/app | head -n1)"
sed -n '28,46p' "$DB_FILE"

echo
echo "=== async tests and synchronous run_async call sites ==="
rg -n -C2 '@pytest\.mark\.asyncio|async def test_|run_async\(' backend/tests --type py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/conftest.py` around lines 21 - 71, The conftest helpers leak a
single module-global event loop and a session-scoped in-memory SQLite schema
(symbols: _test_loop, get_test_loop, run_async, init_test_db) causing DB/state
and loop-sharing across tests; fix by removing the global _test_loop and
run_async wrapper, rely on pytest-asyncio's event_loop fixture for async tests
(or make run_async obtain the current loop via asyncio.get_event_loop()), and
change init_test_db to create a fresh database per test (function scope) or stop
using StaticPool in app.database so each test gets its own in-memory DB; also
ensure module-level asyncio primitives like _reputation_lock are instantiated
per-loop or lazily inside functions to avoid cross-test loop/lock reuse.
backend/app/seed_leaderboard.py (1)
100-106: ⚠️ Potential issue | 🔴 Critical

seed_leaderboard() still nests an event loop inside an async startup path.

Lines 100-106 call asyncio.get_event_loop().run_until_complete(...), while the docstring says this wrapper is used from main.py lifespan. If the current call site is still an async lifespan or async seeding function, this raises at runtime before any contributor rows are seeded.

Expected result: at least one seed_leaderboard() call appears inside an async lifecycle or seeding function; if so, the synchronous wrapper is unsafe.
#!/bin/bash
set -euo pipefail

echo "=== seed_leaderboard definition ==="
SEED_FILE="$(fd '^seed_leaderboard\.py$' backend | head -n1)"
sed -n '96,110p' "$SEED_FILE"

echo
echo "=== seed_leaderboard call sites ==="
rg -n -C3 '\bseed_leaderboard\s*\(' backend --type py

echo
echo "=== async startup / seeding contexts ==="
rg -n -C3 'async def .*lifespan|@asynccontextmanager|async def .*seed_all' backend --type py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/seed_leaderboard.py` around lines 100 - 106, seed_leaderboard()
is a synchronous wrapper that calls
asyncio.get_event_loop().run_until_complete(async_seed_leaderboard()), which is
unsafe when invoked from an async lifespan/startup; remove this nested
event-loop pattern by eliminating the sync wrapper and updating call sites to
await async_seed_leaderboard() directly (e.g., replace usages in main.py
lifespan or any async seed functions with "await async_seed_leaderboard()"); if
a truly synchronous entrypoint is required, provide a safe alternative that
checks for a running loop and uses asyncio.run(async_seed_leaderboard()) only
when no loop is running or else raises a clear error—reference the functions
seed_leaderboard and async_seed_leaderboard and the main.py lifespan call sites
when applying the change.
backend/alembic/versions/001_create_contributors_table.py (1)
110-137: ⚠️ Potential issue | 🟠 Major

Missing unique constraint on (contributor_id, bounty_id) allows duplicate reputation entries.

Without a database-level unique constraint, concurrent requests could insert duplicate reputation records for the same bounty, even if application code attempts deduplication.

— This was flagged in a prior review. Add:
sa.UniqueConstraint("contributor_id", "bounty_id", name="uq_reputation_contributor_bounty"),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/alembic/versions/001_create_contributors_table.py` around lines 110 -
137, Add a DB-level unique constraint to the reputation table to prevent
duplicate entries by modifying the op.create_table call for
"reputation_history": include sa.UniqueConstraint("contributor_id", "bounty_id",
name="uq_reputation_contributor_bounty") as one of the table elements in the
op.create_table invocation so the reputation_history table enforces uniqueness
on (contributor_id, bounty_id) at the database level.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/alembic/env.py`:
- Around line 17-18: Alembic's autogenerate isn't seeing the reputation_history
table because only ContributorTable is imported into env.py; import the
ReputationHistoryDB model as well so target_metadata = Base.metadata includes
its schema. Add an import for ReputationHistoryDB (the model/class named
ReputationHistoryDB) alongside the existing from app.models.contributor import
ContributorTable so Alembic can detect and include future schema changes for
reputation_history.

In `@backend/alembic/versions/001_create_contributors_table.py`:
- Line 125: Add a database index on reputation_history.bounty_id to avoid full
table scans when queries filter by bounty_id; update the migration (in the
revision that defines reputation_history) to create an index for the bounty_id
column (e.g., using op.create_index with a name like
ix_reputation_history_bounty_id) and add the corresponding op.drop_index in the
downgrade to cleanly roll back; ensure the index creation references the
reputation_history table and the bounty_id column.
- Around line 104-108: Update the index creation for
ix_contributors_reputation_earnings so the physical index matches the query
ORDER BY used in leaderboard_service (total_earnings DESC, reputation_score
DESC): change the op.create_index call in the migration to create the index with
explicit DESC ordering for both columns using PostgreSQL-aware expressions
(e.g., pass column expressions like sa.text("total_earnings DESC") and
sa.text("reputation_score DESC") or use the Postgres-specific options) so the
index on the contributors table matches the DESC order used in
backend/app/services/leaderboard_service.py.

In `@backend/app/services/github_sync.py`:
- Around line 393-396: The function-level imports of contributor_service and
Decimal inside sync_contributors are acceptable to avoid circular import
problems but lack an explanatory note; add a short comment directly above the
in-function imports in sync_contributors explaining that these imports are
intentionally local to prevent circular imports (mention contributor_service and
Decimal by name) so future maintainers understand the reason and do not move
them to module scope.

In `@backend/app/services/leaderboard_service.py`:
- Around line 195-207: The category filter contains a redundant dialect check —
remove the if/else branching around db_session.bind.dialect.name and apply a
single query.where using cast(ContributorTable.skills,
String).like(f'%"{category.value}"%') so the filter logic is not duplicated;
update the code that currently references category, db_session,
ContributorTable.skills, and query.where to use one unified branch.
- Around line 179-193: Redundant dialect check: remove the
db_session.bind.dialect.name == "postgresql" conditional and collapse both
branches into a single statement that applies the badge filter; specifically, in
the code that builds tier_label and filters ContributorTable.badges, replace the
if/else block with one query = query.where(cast(ContributorTable.badges,
String).like(f'%"{tier_label}"%')) so only that single expression remains (refer
to tier_label, db_session.bind, ContributorTable.badges, and the
cast(...).like(...) call).

In `@backend/app/services/reputation_service.py`:
- Around line 44-47: The current hydration only merges when load_reputation()
returns truthy, leaving stale in-memory entries when the DB is empty; change the
logic in the hydration block (where load_reputation(), _reputation_lock, and
_reputation_store are used) to always update the in-memory store under
_reputation_lock: acquire _reputation_lock, clear or reset _reputation_store,
then update it with the loaded payload (treat None as an empty mapping) so an
empty DB state replaces prior in-memory state rather than preserving it.

In `@backend/scripts/seed_contributors_from_github.py`:
- Around line 29-30: The script currently mutates sys.path via
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..")), which is
fragile; remove this sys.path manipulation and switch to running the script as a
proper module or using package-relative imports. Replace the ad-hoc path hack by
restoring normal imports in seed_contributors_from_github.py (remove the
sys.path.insert call) and update how it is invoked (e.g., run with python -m
backend.scripts.seed_contributors_from_github from the project root or adjust
imports to use the package namespace) so the code relies on the package import
system rather than modifying sys.path at runtime. Ensure any import lines in the
script use the package-qualified names referenced by the repository instead of
relative filesystem hacks.

In `@backend/tests/test_leaderboard.py`:
- Around line 257-269: The timing assertion in
test_leaderboard_under_100ms_with_cache is flaky; change the test to be more
CI-tolerant by raising the threshold (e.g., from 100ms to 200ms) and adding a
small retry loop that runs run_async(get_leaderboard()) up to 3 times (with a
short sleep) and uses the minimum elapsed_ms for the assertion; keep references
to the test function name test_leaderboard_under_100ms_with_cache and the
get_leaderboard/run_async calls so the change is localized.

---

Outside diff comments:
In `@backend/app/api/leaderboard.py`:
- Around line 39-107: The current endpoint (function leaderboard) always returns
a bare array built from result.entries and drops the service's
LeaderboardResponse fields; update leaderboard to preserve the backend-style
schema when callers use the backend param by checking whether the period
argument was provided (i.e., period is not None) and, if so, return a
JSONResponse containing the full structured payload: period, total, offset,
limit, top3 (from the LeaderboardResponse returned by get_leaderboard) and the
mapped contributors (built from result.entries); otherwise keep the existing
frontend-friendly array response. Ensure you reference get_leaderboard,
LeaderboardResponse, result.entries, and preserve all original fields (period,
total, offset, limit, top3) in the response.

In `@backend/app/database.py`:
- Around line 97-118: The DB schema bootstrap currently swallows all exceptions
around the model imports and await conn.run_sync(Base.metadata.create_all)
(where ContributorTable, ReputationHistoryDB, ContributorTable, BountyTable,
SubmissionDB, etc. are imported), which allows startup to continue with missing
persistence; change this to fail fast by removing the blanket except Exception
that only logs a warning and instead propagate the error (or log with
logger.error and re-raise) so init_db() aborts startup/tests on import/DDL
failures; alternatively catch only expected non-fatal errors and re-raise
unexpected ones to ensure missing contributor/reputation tables stop startup.

In `@backend/app/services/reputation_service.py`:
- Around line 236-258: The current checks (duplicate bounty_id, allowed tier,
anti-farming) rely on the in-memory _reputation_store and can be inconsistent
across processes; replace those validations with authoritative DB
queries/transactions: inside the critical section (or a DB transaction using
contributor_service.get_contributor_db as needed) query PostgreSQL for existing
reputation/history rows for data.contributor_id (and for an existing bounty_id)
instead of reading _reputation_store, compute allowed tier using the persisted
history (i.e., call _allowed_tier_for_contributor with the DB-derived history)
and compute is_veteran from DB history as well, and enforce idempotency/tier
checks based on those results (or use SELECT ... FOR UPDATE/unique constraints +
handle conflicts) rather than the _reputation_store cache.
- Around line 278-297: The current flow updates _reputation_store and calls
contributor_service.update_reputation_score before persisting, which can cause
in-memory state to diverge if persist_reputation_entry(entry) fails; change the
flow so you call persist_reputation_entry(entry) first and only update
_reputation_store and call contributor_service.update_reputation_score after a
successful persist, or if you must update memory first then ensure you rollback
the in-memory append and revert the contributor score when
persist_reputation_entry raises; locate the logic around
_reputation_store.setdefault(data.contributor_id, []).append(entry), the
subsequent round(total, 2) update via
contributor_service.update_reputation_score, and the persist_reputation_entry
call and implement the transactional ordering or compensating rollback to ensure
get_reputation() will reflect only persisted entries.

---

Duplicate comments:
In `@backend/alembic/versions/001_create_contributors_table.py`:
- Around line 110-137: Add a DB-level unique constraint to the reputation table
to prevent duplicate entries by modifying the op.create_table call for
"reputation_history": include sa.UniqueConstraint("contributor_id", "bounty_id",
name="uq_reputation_contributor_bounty") as one of the table elements in the
op.create_table invocation so the reputation_history table enforces uniqueness
on (contributor_id, bounty_id) at the database level.

In `@backend/app/api/health.py`:
- Around line 52-69: The health endpoint calls
contributor_service.count_contributors() and directly references
bounty_service._bounty_store even when db_status is "disconnected", which lets
DB errors raise 500s and leaks a private implementation detail; change logic in
the health handler to only call contributor_service.count_contributors() when
db_status == "connected" (or wrap the call in a try/except and fall back to a
safe default like None or 0 on error), and stop importing/using the private
_bounty_store—use a public API (e.g., add/use a
bounty_service.get_bounty_count() or similar) or compute a safe fallback count;
ensure get_last_sync() remains unchanged but handle its None case as already
done so the endpoint returns a degraded payload instead of raising on DB
failures.

In `@backend/app/seed_leaderboard.py`:
- Around line 100-106: seed_leaderboard() is a synchronous wrapper that calls
asyncio.get_event_loop().run_until_complete(async_seed_leaderboard()), which is
unsafe when invoked from an async lifespan/startup; remove this nested
event-loop pattern by eliminating the sync wrapper and updating call sites to
await async_seed_leaderboard() directly (e.g., replace usages in main.py
lifespan or any async seed functions with "await async_seed_leaderboard()"); if
a truly synchronous entrypoint is required, provide a safe alternative that
checks for a running loop and uses asyncio.run(async_seed_leaderboard()) only
when no loop is running or else raises a clear error—reference the functions
seed_leaderboard and async_seed_leaderboard and the main.py lifespan call sites
when applying the change.

In `@backend/requirements.txt`:
- Line 7: Remove the duplicate package constraint so the requirements manifest
contains only a single entry for alembic (alembic>=1.13.0,<2.0.0); locate the
second/extra occurrence of the exact string "alembic>=1.13.0,<2.0.0" in
requirements.txt and delete it, ensuring no other duplicate package pins remain
in the file.

In `@backend/tests/conftest.py`:
- Around line 21-71: The conftest helpers leak a single module-global event loop
and a session-scoped in-memory SQLite schema (symbols: _test_loop,
get_test_loop, run_async, init_test_db) causing DB/state and loop-sharing across
tests; fix by removing the global _test_loop and run_async wrapper, rely on
pytest-asyncio's event_loop fixture for async tests (or make run_async obtain
the current loop via asyncio.get_event_loop()), and change init_test_db to
create a fresh database per test (function scope) or stop using StaticPool in
app.database so each test gets its own in-memory DB; also ensure module-level
asyncio primitives like _reputation_lock are instantiated per-loop or lazily
inside functions to avoid cross-test loop/lock reuse.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a07b5ac-fb57-4301-90fe-c55d08d4e017

📥 Commits

Reviewing files that changed from the base of the PR and between 1438f59 and 9c72bbd.

📒 Files selected for processing (24)

backend/alembic.ini
backend/alembic/__init__.py
backend/alembic/env.py
backend/alembic/script.py.mako
backend/alembic/versions/001_create_contributors_table.py
backend/alembic/versions/__init__.py
backend/app/api/contributors.py
backend/app/api/health.py
backend/app/api/leaderboard.py
backend/app/database.py
backend/app/models/contributor.py
backend/app/models/leaderboard.py
backend/app/seed_leaderboard.py
backend/app/services/contributor_service.py
backend/app/services/github_sync.py
backend/app/services/leaderboard_service.py
backend/app/services/reputation_service.py
backend/requirements.txt
backend/scripts/__init__.py
backend/scripts/seed_contributors_from_github.py
backend/tests/conftest.py
backend/tests/test_contributors.py
backend/tests/test_leaderboard.py
backend/tests/test_reputation.py

coderabbitai · 2026-03-21T17:13:13Z

backend/alembic/env.py

+# Import all models so Alembic can detect them
+from app.models.contributor import ContributorTable  # noqa: F401


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check if ReputationHistoryDB exists in contributor.py and uses Base rg -n "class ReputationHistoryDB" backend/app/models/

Repository: SolFoundry/solfoundry

Length of output: 134

🏁 Script executed:

# Check current imports in env.py around lines 17-18 head -30 backend/alembic/env.py | cat -n

Repository: SolFoundry/solfoundry

Length of output: 1128

🏁 Script executed:

# Check if there are other model imports beyond what's shown rg -n "from app.models" backend/alembic/env.py

Repository: SolFoundry/solfoundry

Length of output: 133

🏁 Script executed:

# Check if ReputationHistoryDB and ContributorTable both inherit from Base rg -A 5 "class (ContributorTable|ReputationHistoryDB)" backend/app/models/contributor.py

Repository: SolFoundry/solfoundry

Length of output: 568

🏁 Script executed:

# Check migration files to see what tables are created find backend/alembic/versions -name "*.py" -type f | head -5 | xargs grep -l "reputation_history" 2>/dev/null || echo "No migration files found mentioning reputation_history"

Repository: SolFoundry/solfoundry

Length of output: 122

🏁 Script executed:

# Check the migration file to see what tables it creates cat -n backend/alembic/versions/001_create_contributors_table.py

Repository: SolFoundry/solfoundry

Length of output: 5451

🏁 Script executed:

# Check if there are any other migration files beyond the first one find backend/alembic/versions -name "*.py" -type f | sort

Repository: SolFoundry/solfoundry

Length of output: 159

🏁 Script executed:

# Check if any of those migrations use autogenerate pattern (look for `@revision` markers or revision info) rg -l "autogenerate" backend/alembic/versions/ 2>/dev/null || echo "No autogenerate marker found"

Repository: SolFoundry/solfoundry

Length of output: 93

🏁 Script executed:

# Verify both models are defined with Base rg -B 1 "class (ContributorTable|ReputationHistoryDB)" backend/app/models/contributor.py

Repository: SolFoundry/solfoundry

Length of output: 132

Import ReputationHistoryDB so Alembic autogenerate detects schema changes.

The migration creates a reputation_history table, but only ContributorTable is imported for Alembic's metadata detection. Since target_metadata = Base.metadata relies on imported models to populate the schema, ReputationHistoryDB must be imported for Alembic's autogenerate command to detect future changes to the reputation_history table:

# Import all models so Alembic can detect them from app.models.contributor import ContributorTable # noqa: F401 +from app.models.contributor import ReputationHistoryDB # noqa: F401

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Import all models so Alembic can detect them

from app.models.contributor import ContributorTable # noqa: F401

# Import all models so Alembic can detect them

from app.models.contributor import ContributorTable # noqa: F401

from app.models.contributor import ReputationHistoryDB # noqa: F401

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/alembic/env.py` around lines 17 - 18, Alembic's autogenerate isn't seeing the reputation_history table because only ContributorTable is imported into env.py; import the ReputationHistoryDB model as well so target_metadata = Base.metadata includes its schema. Add an import for ReputationHistoryDB (the model/class named ReputationHistoryDB) alongside the existing from app.models.contributor import ContributorTable so Alembic can detect and include future schema changes for reputation_history.

coderabbitai · 2026-03-21T17:13:13Z

backend/alembic/versions/001_create_contributors_table.py

+    op.create_index(
+        "ix_contributors_reputation_earnings",
+        "contributors",
+        ["total_earnings", "reputation_score"],
+    )


🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check how leaderboard queries order results rg -n "order_by" backend/app/services/leaderboard_service.py

Repository: SolFoundry/solfoundry

Length of output: 100

🏁 Script executed:

# Read the leaderboard_service.py file around line 209 to see the actual order_by clause sed -n '200,220p' backend/app/services/leaderboard_service.py

Repository: SolFoundry/solfoundry

Length of output: 664

🏁 Script executed:

# Also search for any DESC patterns in leaderboard-related queries rg -n "DESC|desc" backend/app/services/leaderboard_service.py

Repository: SolFoundry/solfoundry

Length of output: 248

🏁 Script executed:

# Check the full leaderboard_service.py file to see all order_by usages rg -B 2 -A 2 "order_by" backend/app/services/leaderboard_service.py

Repository: SolFoundry/solfoundry

Length of output: 221

🏁 Script executed:

# Check if there are other leaderboard-related queries that might have different ordering rg -n "leaderboard" backend/app/services/ -i --type py

Repository: SolFoundry/solfoundry

Length of output: 2844

🏁 Script executed:

# Check the migration file to see the exact index definition cat -n backend/alembic/versions/001_create_contributors_table.py | sed -n '100,115p'

Repository: SolFoundry/solfoundry

Length of output: 627

Index column order does not match query pattern — consider explicit DESC ordering.

The index ix_contributors_reputation_earnings is created with default ascending order on ["total_earnings", "reputation_score"], but the leaderboard queries in backend/app/services/leaderboard_service.py (lines 209–211) use ORDER BY total_earnings DESC, reputation_score DESC. While PostgreSQL can use ascending indexes for descending queries via backward scans, this is less efficient than having the index match the query order.

The migration already documents the intended pattern in its comment; align the index definition with explicit DESC ordering using Alembic's PostgreSQL-specific options for optimal query performance.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/alembic/versions/001_create_contributors_table.py` around lines 104 - 108, Update the index creation for ix_contributors_reputation_earnings so the physical index matches the query ORDER BY used in leaderboard_service (total_earnings DESC, reputation_score DESC): change the op.create_index call in the migration to create the index with explicit DESC ordering for both columns using PostgreSQL-aware expressions (e.g., pass column expressions like sa.text("total_earnings DESC") and sa.text("reputation_score DESC") or use the Postgres-specific options) so the index on the contributors table matches the DESC order used in backend/app/services/leaderboard_service.py.

coderabbitai · 2026-03-21T17:13:13Z

backend/alembic/versions/001_create_contributors_table.py

+            nullable=False,
+            index=True,
+        ),
+        sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False),


🧹 Nitpick | 🔵 Trivial

Consider adding index on reputation_history.bounty_id for lookups.

The bounty_id column is used for idempotency checks (finding existing entries by bounty), but lacks an index. Queries filtering by bounty_id will require full table scans.

- sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False), + sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False, index=True),

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False),

sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False, index=True),

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/alembic/versions/001_create_contributors_table.py` at line 125, Add a database index on reputation_history.bounty_id to avoid full table scans when queries filter by bounty_id; update the migration (in the revision that defines reputation_history) to create an index for the bounty_id column (e.g., using op.create_index with a name like ix_reputation_history_bounty_id) and add the corresponding op.drop_index in the downgrade to cleanly roll back; ensure the index creation references the reputation_history table and the bounty_id column.

coderabbitai · 2026-03-21T17:13:13Z

backend/app/services/github_sync.py

+    """Sync merged PRs + known payouts → PostgreSQL + in-memory cache."""
+    from app.services import contributor_service
+    from decimal import Decimal
    import uuid


🧹 Nitpick | 🔵 Trivial

Function-level imports are acceptable but unconventional.

The imports for contributor_service and Decimal inside sync_contributors() avoid circular import issues but deviate from PEP 8 conventions. This is acceptable given the module's structure, but consider adding a brief comment explaining why.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/app/services/github_sync.py` around lines 393 - 396, The function-level imports of contributor_service and Decimal inside sync_contributors are acceptable to avoid circular import problems but lack an explanatory note; add a short comment directly above the in-function imports in sync_contributors explaining that these imports are intentionally local to prevent circular imports (mention contributor_service and Decimal by name) so future maintainers understand the reason and do not move them to module scope.

backend/app/services/leaderboard_service.py

coderabbitai · 2026-03-21T17:13:13Z

backend/app/services/leaderboard_service.py

+        if category:
+            if db_session.bind and db_session.bind.dialect.name == "postgresql":
+                query = query.where(
+                    cast(ContributorTable.skills, String).like(
+                        f'%"{category.value}"%'
+                    )
+                )
+            else:
+                query = query.where(
+                    cast(ContributorTable.skills, String).like(
+                        f'%"{category.value}"%'
+                    )
+                )


🧹 Nitpick | 🔵 Trivial

Same redundant dialect check for category filter.

Lines 196-207 duplicate the same pattern: identical if/else branches. Simplify by removing the dialect check.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/app/services/leaderboard_service.py` around lines 195 - 207, The category filter contains a redundant dialect check — remove the if/else branching around db_session.bind.dialect.name and apply a single query.where using cast(ContributorTable.skills, String).like(f'%"{category.value}"%') so the filter logic is not duplicated; update the code that currently references category, db_session, ContributorTable.skills, and query.where to use one unified branch.

coderabbitai · 2026-03-21T17:13:13Z

backend/app/services/reputation_service.py

    loaded = await load_reputation()
    if loaded:
-        with _reputation_lock:
+        async with _reputation_lock:
            _reputation_store.update(loaded)


⚠️ Potential issue | 🟡 Minor

Hydration never removes stale reputation history.

Lines 44-47 only merge load_reputation() results into _reputation_store, and they skip the update entirely when the DB returns an empty payload. A second hydration after deletions—or any empty DB state—leaves old contributors/history resident in memory even though PostgreSQL is supposed to be the source of truth.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/app/services/reputation_service.py` around lines 44 - 47, The current hydration only merges when load_reputation() returns truthy, leaving stale in-memory entries when the DB is empty; change the logic in the hydration block (where load_reputation(), _reputation_lock, and _reputation_store are used) to always update the in-memory store under _reputation_lock: acquire _reputation_lock, clear or reset _reputation_store, then update it with the loaded payload (treat None as an empty mapping) so an empty DB state replaces prior in-memory state rather than preserving it.

coderabbitai · 2026-03-21T17:13:13Z

backend/scripts/seed_contributors_from_github.py

+# Ensure the backend package is importable
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))


🧹 Nitpick | 🔵 Trivial

sys.path manipulation for imports is fragile.

Inserting .. into sys.path is a common workaround but can break if the script is invoked from unexpected directories. Consider using a proper package structure or running via python -m scripts.seed_contributors_from_github from the backend directory as documented.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/scripts/seed_contributors_from_github.py` around lines 29 - 30, The script currently mutates sys.path via sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..")), which is fragile; remove this sys.path manipulation and switch to running the script as a proper module or using package-relative imports. Replace the ad-hoc path hack by restoring normal imports in seed_contributors_from_github.py (remove the sys.path.insert call) and update how it is invoked (e.g., run with python -m backend.scripts.seed_contributors_from_github from the project root or adjust imports to use the package namespace) so the code relies on the package import system rather than modifying sys.path at runtime. Ensure any import lines in the script use the package-qualified names referenced by the repository instead of relative filesystem hacks.

coderabbitai · 2026-03-21T17:13:13Z

backend/tests/test_leaderboard.py

+def test_leaderboard_under_100ms_with_cache():
+    """Cached leaderboard response returns within 100ms target."""
+    for i in range(10):
+        _seed_contributor(
+            f"perf{i}", f"Perf {i}", total_earnings=float(100 * i)
+        )
+    run_async(get_leaderboard())  # warm the cache
+    start = time.time()
+    run_async(get_leaderboard())
+    elapsed_ms = (time.time() - start) * 1000
+    assert elapsed_ms < 100, (
+        f"Cached leaderboard took {elapsed_ms:.1f}ms (target <100ms)"
+    )


⚠️ Potential issue | 🟡 Minor

Timing-based test may be flaky in CI environments.

The assertion elapsed_ms < 100 depends on system performance. In resource-constrained CI runners, this could fail spuriously. Consider:

Increasing the threshold with margin (e.g., 200ms)

Using a retry mechanism

Marking as a performance benchmark rather than hard assertion

- assert elapsed_ms < 100, ( + # Allow headroom for CI variability; actual target is <100ms + assert elapsed_ms < 200, ( f"Cached leaderboard took {elapsed_ms:.1f}ms (target <100ms)" )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_leaderboard.py` around lines 257 - 269, The timing assertion in test_leaderboard_under_100ms_with_cache is flaky; change the test to be more CI-tolerant by raising the threshold (e.g., from 100ms to 200ms) and adding a small retry loop that runs run_async(get_leaderboard()) up to 3 times (with a short sleep) and uses the minimum elapsed_ms for the assertion; keep references to the test function name test_leaderboard_under_100ms_with_cache and the get_leaderboard/run_async calls so the change is localized.

github-actions

Automated review: changes requested (see comment)

github-actions · 2026-03-21T17:26:44Z

❌ Multi-LLM Code Review — REQUEST CHANGES

Aggregated Score: 5.5/10 (median of 5 models)
Tier: tier-2 | Threshold: 7.0/10

Model Verdicts

Model	Score	Verdict
Sonnet 4.6	3.2/10	⚠️
DeepSeek V3.2	6.7/10	⚠️
Grok 4	5.4/10	⚠️
GPT-5.4	4.4/10	⚠️
Gemini 2.5 Pro	6.7/10	⚠️

Category Scores (Median)

Category	Score
Quality	█████░░░░░ 5.8/10
Correctness	███░░░░░░░ 3.8/10
Security	██████░░░░ 6.5/10
Completeness	█████░░░░░ 5.3/10
Tests	██████░░░░ 6.1/10
Integration	████░░░░░░ 4.6/10

Warning: CodeRabbit Critical Issues

CodeRabbit flagged 4 critical issues that appear unresolved.
These must be addressed before approval.

Warning: Bounty Spec Compliance: MINIMAL

Multiple models flagged that this submission does not fully meet the bounty acceptance criteria.

Summary

Sonnet 4.6: This submission has fundamental blocking issues with unresolved merge conflict markers in critical files that prevent the application from running. While the overall architecture and PostgreSQL migration approach is sound, the presence of Git conflict markers in alembic.ini, contributors.py, and reputation_service.py makes this submission non-functional.
DeepSeek V3.2: This PR implements a comprehensive PostgreSQL migration for contributor and leaderboard services with solid architectural design and good test coverage. However, it has critical unresolved merge conflicts that break functionality, incomplete acceptance criteria, and several database-level integrity issues that must be addressed before production deployment.
Grok 4: This PR attempts a comprehensive migration to PostgreSQL for contributors and leaderboard, including models, migrations, and service updates, but is severely undermined by unresolved critical issues like merge conflict markers that render key files unparsable. While it meets many bounty requirements such as async DB operations and seeding, the presence of these blockers prevents it from being production-ready. Overall, it shows good effort in refactoring but fails on basic code integrity and completeness due to unaddressed CodeRabbit findings.
GPT-5.4: This PR shows substantial effort toward moving contributor and leaderboard data to PostgreSQL, and it adds meaningful tests and service refactors. However, it is not mergeable in its current state: unresolved critical CodeRabbit findings indicate import/parsing failures, there are migration/model inconsistencies, and several integration issues mean the backend would likely break at runtime despite the breadth of changes.
Gemini 2.5 Pro: The PR successfully migrates the contributor and leaderboard services to a PostgreSQL backend, including models, migrations, and a full async rewrite of the service layer. However, the submission falls short of the T2 quality bar due to several significant correctness and security issues, including an unaddressed critical CodeRabbit finding, a missing database constraint creating a race condition, and hardcoded credentials.

Issues

[Sonnet 4.6] CRITICAL: Unresolved merge conflict markers in backend/alembic.ini, backend/app/api/contributors.py, and backend/app/services/reputation_service.py prevent the application from starting
[Sonnet 4.6] CRITICAL: Missing unique constraint on (contributor_id, bounty_id) in reputation_history table creates potential for duplicate reputation awards
[Sonnet 4.6] Missing Alembic migration files - the diff references migrations but they are not included in the PR
[Sonnet 4.6] Hardcoded fallback credentials in alembic/env.py pose security risk if DATABASE_URL is unset
[Sonnet 4.6] Health check endpoint lacks error handling for count_contributors() database call
[DeepSeek V3.2] Critical: Unresolved merge conflict markers in alembic.ini, contributors.py, and reputation_service.py break functionality and prevent imports
[DeepSeek V3.2] Missing database-level unique constraint on (contributor_id, bounty_id) for reputation_history table despite idempotency logic in service layer
[DeepSeek V3.2] Health check endpoint lacks error handling for database queries, causing full endpoint failure on DB issues
[DeepSeek V3.2] Username uniqueness has TOCTOU race condition between pre-check and database commit
[DeepSeek V3.2] Leaderboard performance target (<100ms) not verified with actual benchmarks or load testing
[Grok 4] Resolve merge conflict markers in alembic.ini, contributors.py, and reputation_service.py to make files parsable and functional.
[Grok 4] Address asyncio.get_event_loop deprecation in seed_leaderboard.py by using asyncio.get_running_loop or restructuring the async code.
[Grok 4] Add unique constraint on (contributor_id, bounty_id) in reputation_history table to enforce idempotency at the database level.
[Grok 4] Wrap database calls in health.py and add error handling to prevent endpoint failures on query errors.
[Grok 4] Fix TOCTOU race in contributor_service.py for username uniqueness by handling DB violations properly.

Suggestions

[Sonnet 4.6] Resolve all Git merge conflict markers before submission
[Sonnet 4.6] Add unique constraint on (contributor_id, bounty_id) in reputation_history table migration
[Sonnet 4.6] Include all referenced Alembic migration files in the PR
[Sonnet 4.6] Remove hardcoded credentials from fallback DATABASE_URL
[Sonnet 4.6] Add proper error handling to health check database operations
[DeepSeek V3.2] Resolve all merge conflict markers immediately to make the codebase functional
[DeepSeek V3.2] Add composite unique constraint on reputation_history(contributor_id, bounty_id) to enforce idempotency at database level
[DeepSeek V3.2] Wrap health check database calls in try-except blocks to prevent cascading failures
[DeepSeek V3.2] Implement database-level uniqueness constraint for username with proper conflict handling in service layer
[DeepSeek V3.2] Add performance benchmarks for leaderboard queries under realistic load conditions

Contributor stats: 6 merged bounty PRs, rep score 53

SolFoundry Multi-LLM Review Pipeline v3.0 — GPT-5.4 + Gemini 2.5 Pro + Grok 4 + Sonnet 4.6 + DeepSeek V3.2
Scoring: trimmed_mean across 5 models | ⚠️ High model disagreement (spread: 3.5)

Next Steps

Please address the issues above and push updated commits. The review will re-run automatically.

github-actions

Automated review: changes requested (see comment)

github-actions bot requested changes Mar 21, 2026

View reviewed changes

feat: implement postgres migration for contributor and leaderboard

0a963a6

codebestia force-pushed the fix/postgres-migration-186 branch from 1438f59 to 0a963a6 Compare March 21, 2026 16:56

github-actions bot requested changes Mar 21, 2026

View reviewed changes

coderabbitai bot reviewed Mar 21, 2026

View reviewed changes

chore: refinement

a414e79

codebestia force-pushed the fix/postgres-migration-186 branch from 9c72bbd to a414e79 Compare March 21, 2026 17:23

Merge branch 'main' into fix/postgres-migration-186

52ad8b9

github-actions bot requested changes Mar 21, 2026

View reviewed changes

codebestia closed this Mar 21, 2026

		skill_list = skills.split(",") if skills else None
		badge_list = badges.split(",") if badges else None

		# Import all models so Alembic can detect them
		from app.models.contributor import ContributorTable # noqa: F401

	sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False),
	sa.Column("bounty_id", postgresql.UUID(as_uuid=True), nullable=False, index=True),

		# Ensure the backend package is importable
		sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

Conversation

codebestia commented Mar 21, 2026

Description

Solana Wallet for Payout

Type of Change

Checklist

Testing

Screenshots (if applicable)

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Mar 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Mar 21, 2026 •

edited

Loading